Fast filtering for RNA homology search

Authors

  • Diana L. Kolbe
  • Sean R. Eddy
Abstract

MOTIVATION: Homology search for RNAs can use secondary structure information to increase power by modeling base pairs, as in covariance models, but the resulting computational costs are high. Typical acceleration strategies rely on at least one filtering stage using sequence-only search.

RESULTS: Here we present the multi-segment CYK (MSCYK) filter, which implements a heuristic of ungapped structural alignment for RNA homology search. Compared to gapped alignment, this approximation has lower computation time requirements (O(N⁴) reduced to O(N³)) and lower space requirements (O(N³) reduced to O(N²)). A vector-parallel implementation of this method gives up to 100-fold speed-up; vector-parallel implementations of standard gapped alignment at two levels of precision give 3- and 6-fold speed-ups. These approaches are combined to create a filtering pipeline that scores RNA secondary structure at all stages, with results that are synergistic with existing methods.
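The idea of a filtering pipeline, in which a cheap score discards most candidates before a more expensive score is computed, can be illustrated with a minimal sketch. This is not the MSCYK algorithm or the authors' implementation: the function names, the two toy scorers, and the thresholds are all hypothetical, chosen only to show how staged filtering avoids running later stages on rejected windows.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# A scorer maps a candidate sequence window to a score (higher is better).
Scorer = Callable[[str], float]

# Toy pairing rules (assumption): Watson-Crick pairs plus G-U wobble pairs.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WITH_WOBBLE = WATSON_CRICK | {("G", "U"), ("U", "G")}


def ungapped_pair_score(window: str) -> float:
    """Hypothetical fast stage: count pairs between mirrored positions, no gaps."""
    half = len(window) // 2
    return float(sum((window[i], window[-1 - i]) in WITH_WOBBLE for i in range(half)))


def watson_crick_score(window: str) -> float:
    """Hypothetical stricter stage: count only Watson-Crick mirrored pairs."""
    half = len(window) // 2
    return float(sum((window[i], window[-1 - i]) in WATSON_CRICK for i in range(half)))


def run_filter_cascade(
    windows: Iterable[str],
    stages: Sequence[Tuple[Scorer, float]],
) -> List[str]:
    """Keep a window only if it meets the threshold of every stage, in order.

    all() stops at the first failing stage, so later (costlier) scorers
    are never called on windows that an earlier stage rejected.
    """
    survivors = []
    for window in windows:
        if all(score(window) >= threshold for score, threshold in stages):
            survivors.append(window)
    return survivors


windows = ["GGGAAACCC", "AAAAAAAAA", "GUGAAACAU"]
stages = [(ungapped_pair_score, 3.0), (watson_crick_score, 3.0)]
hits = run_filter_cascade(windows, stages)
```

In a real pipeline the thresholds at each stage trade sensitivity against speed; the values above are arbitrary.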

Similar resources

Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets

Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...


Computational Identification of Micro RNAs and Their Transcript Target(s) in Field Mustard (Brassica rapa L.)

Background: Micro RNAs (miRNAs) are a pivotal part of non-protein-coding endogenous small RNA molecules that regulate the genes involved in plant growth and development, and respond to biotic and abiotic environmental stresses posttranscriptionally. Objective: In the present study, we report the results of a systemic search for identification of new miRNAs in B. rapa using homology-based ...


PatternHunter II: Highly Sensitive and Fast Homology Search

Extending the single optimized spaced seed of PatternHunter to multiple ones, PatternHunter II simultaneously remedies the lack of sensitivity of Blastn and the lack of speed of Smith-Waterman, for homology search. At Blastn speed, PatternHunter II approaches Smith-Waterman sensitivity, bringing homology search technology back to a full circle.


FASTR3D: a fast and accurate search tool for similar RNA 3D structures

FASTR3D is a web-based search tool that allows the user to quickly and accurately search the PDB database for structurally similar RNAs. Currently, it allows the user to input three types of queries: (i) a PDB code of an RNA tertiary structure (default), optionally with specified residue range, (ii) an RNA secondary structure, optionally with primary sequence, in the dot-bracket notation and (iii)...


Effective Query Filtering for Fast Homology Searching

To improve the accuracy of rapid homology searching it is common practice to filter all queries to mask low complexity regions prior to searching. We show in this paper, through a large-scale study of querying the PIR database, that applying popular filtering techniques unselectively to all queries may reduce retrieval effectiveness. We also show that masking queries with our new technique, caf...



Journal title:

Volume 27, Issue

Pages -

Publication date: 2011